The acquisition of a unification-based generalised categorial grammar
Abstract
The purpose of this work is to investigate the process of grammatical acquisition from data. To do so, a computational learning system is used, composed of a Universal Grammar with associated parameters and a learning algorithm, following the Principles and Parameters Theory. The Universal Grammar is implemented as a Unification-Based Generalised Categorial Grammar, embedded in a default inheritance network of lexical types. The learning algorithm receives input from a corpus of spontaneous, transcribed child-directed speech annotated with logical forms, and sets the parameters based on this input. This framework is used as a basis to investigate several aspects of language acquisition. In this thesis I concentrate on the acquisition of subcategorisation frames and word order information from data. The data to which the learner is exposed can be noisy and ambiguous, and I investigate how these factors affect the learning process. The results show a robust learner that converges towards the target grammar given the available input data. They also show how the amount of noise in the input data affects the speed at which the learner converges towards the target grammar. Future work is suggested to investigate the developmental stages of language acquisition predicted by the learning model, with a thorough comparison against the developmental stages of a child. This is primarily a cognitive computational model of language learning that can be used to investigate and better understand human language acquisition, and it is potentially relevant to the development of more adaptive NLP technology.
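The relationship between noise and speed of convergence described in the abstract can be illustrated with a toy sketch. This is not the learning algorithm used in the thesis: the class `ParameterLearner`, the function `steps_to_converge`, the values "forward" and "backward" (standing in for the direction of a categorial slash), and the evidence threshold are all hypothetical, chosen only to show how a single binary word-order parameter might be reset once the evidence for the other value outweighs the current one by a fixed margin.

```python
from collections import Counter


class ParameterLearner:
    """Toy learner for one binary word-order parameter (hypothetical)."""

    def __init__(self, default="backward", threshold=3):
        self.value = default        # current parameter setting
        self.threshold = threshold  # evidence margin required to reset
        self.counts = Counter()     # triggers seen so far, per value

    def observe(self, trigger):
        """Record one trigger; reset the parameter if the margin is reached."""
        self.counts[trigger] += 1
        other = "forward" if self.value == "backward" else "backward"
        if self.counts[other] - self.counts[self.value] >= self.threshold:
            self.value = other
        return self.value


def steps_to_converge(triggers, target, threshold=3):
    """Return the 1-based step from which the learner holds `target`
    for the rest of the input, or None if it does not end on `target`."""
    learner = ParameterLearner(threshold=threshold)
    history = [learner.observe(t) for t in triggers]
    if not history or history[-1] != target:
        return None
    step = len(history)
    # Walk back over the final run of target settings.
    while step > 0 and history[step - 1] == target:
        step -= 1
    return step + 1


# Assumed input: every trigger supports the target value.
clean = ["forward"] * 6
# Assumed input: two noisy "backward" triggers mixed in.
noisy = ["forward", "backward", "forward", "backward",
         "forward", "forward", "forward", "forward"]

print(steps_to_converge(clean, "forward"))
print(steps_to_converge(noisy, "forward"))
```

With the assumed threshold of 3, the clean input resets the parameter after three triggers, while the two noisy triggers delay the reset until the seventh. A margin-based rule like this trades speed for robustness: a larger threshold tolerates more noise but needs more input before converging.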
Similar resources
The Use of Default Unification in a System of Lexical Types
In this paper we describe the encoding of a Unification-Based Generalised Categorial Grammar for English in terms of a default inheritance network of types, implemented with YADU, an order-independent default unification operation on typed feature structures. We then propose to use this framework to encode a Universal Grammar (UG) and associated parameters, following the Principles an...
On the computation of joins for non associative Lambek categorial grammars
This paper deals with an application of unification and rewriting to Lambek categorial grammars used in the field of computational linguistics. Unification plays a crucial role in the acquisition of categorial grammars, as in [Kan98]; a modified unification has been proposed [For01a] in this context for Lambek categorial grammars, to give an account of their logical part. This modif...
The Acquisition of Word Order by a Computational Learning System
The purpose of this work is to investigate the process of grammatical acquisition from data. We are using a computational learning system that is composed of a Universal Grammar with associated parameters, and a learning algorithm, following the Principles and Parameters Theory. The Universal Grammar is implemented as a Unification-Based Generalised Categorial Grammar, embedded in a default in...
PhD Proposal – The Lexicon in Combinatory Categorial Grammar: An Explanatory Theory of Verbal Categories in Natural Languages
The aim of this project is to elaborate a theory of natural language lexicons for Combinatory Categorial Grammar (CCG), a mildly context-sensitive, polynomial-time parsable variant of categorial grammar. This theory will have both a descriptive aspect, exploring the use of appropriate formal machinery for expressing lexical generalisations, and an explanatory aspect, accounting for observed pa...
Conjoinability and unification in Lambek categorial grammars
Recently, learning algorithms in Gold’s model have been proposed for some particular classes of classical categorial grammars [Kan98]. We are interested here in learning Lambek categorial grammars. In general, grammatical inference uses unification and substitution. In the context of Lambek categorial grammars, it seems appropriate to incorporate an operation on types based both on deduction (Lam...